Triple stranded nucleic acids

ABSTRACT

A triple-stranded nucleic acid having a first nucleic acid strand that has a region of adjacent purine nucleoside residues; a second nucleic acid strand, at least a portion of which is hydrogen bonded in a Watson-Crick manner to the region of adjacent purine nucleoside residues of the first strand; and a third nucleic acid strand, at least a portion of which is hydrogen bonded to the portion of the region of adjacent purine nucleoside residues of the first strand, the portion of the region of adjacent purine nucleoside residues to which both the second strand and the third strand are bonded defining the triple-stranded nucleic acid is disclosed.

This is a continuation of application Ser. No. 08/187,890 filed Jan. 28, 1994, now U.S. Pat. No. 5,422,251, which is a continuation of application Ser. No. 07/841,218 filed Feb. 27, 1992, which is a continuation of application Ser. No. 07/622,330 filed Nov. 27, 1990, now abandoned, which was a continuation of application Ser. No. 07/366,244 filed Jun. 9, 1989, now abandoned, which was a continuation of application Ser. No. 06/935,047 filed Nov. 26, 1986, now abandoned.

BACKGROUND OF THE INVENTION

This invention relates to triple-stranded nucleic acids.

Such nucleic acids have been observed in which a central polyadenylate (poly(A)) strand is hydrogen bonded to two polyuridylate (poly(U)) strands (e.g., Felsenfeld et al., 26 Biochim. Biophys. Acta 457 (1957)). This triple-stranded structure has been described by Arnott et al., 244 Nature New Bio. 99 (1973), as having one poly(U) strand hydrogen bonded to the poly(A) strand in the Watson-Crick manner, and the second poly(U) strand hydrogen bonded to the poly(A) strand in a Hoogsteen-type arrangement, with the second poly(U) chain being oriented parallel to the poly(A) strand. Watson-Crick manner, as used herein, means the standard hydrogen bonding arrangement that is present in double-stranded nucleic acids (A.T/U, G.C).

Howard et al., 246 J. Biol. Chem. 7033 (1971), describe a triple-stranded nucleic acid having a central polydeoxyadenylate (poly(dA)) strand hydrogen bonded to two polydeoxythymidylate (poly(dT)) strands. Arnott et al., 3 Nuc. Acid Res. 2459 (1976), say that both this structure and poly(U).poly(A).poly(U) exist as A-type helices.

Other triple-stranded nucleic acids that have been reported include a central polyguanylate (poly(G)) strand hydrogen bonded to two polycytidylate (poly(C)) strands (Lipsett, 239 J. Biol. Chem. 1256 (1964)); a central poly(G) strand hydrogen bonded to a poly(C) strand and a second poly(G) strand (Fresco, in Informational Macromolecules: A Symposium 121 (1963)); a central polyinosinate (poly(I)) strand hydrogen bonded to two poly(C) strands (Arnott et al., 3 Nuc. Acid Res. 2459 (1976)); and a central poly(dA-dG) strand (in which deoxyadenosine and deoxyguanosine residues alternate) hydrogen bonded to a poly(dT-dC) strand (in which deoxythymidine and deoxycytidine residues alternate) and a poly(U-C) strand (in which uridine and cytidine residues alternate) (Morgan et al., 37 J. Mol. Bio. 63 (1968)).

SUMMARY OF THE INVENTION

The invention concerns a triple-stranded nucleic acid having a first nucleic acid strand that has a region of adjacent purine nucleoside residues; a second nucleic acid strand, at least a portion of which is hydrogen bonded in a Watson-Crick manner to the region of adjacent purine nucleoside residues of the first strand; and a third nucleic acid strand, at least a portion of which is hydrogen bonded to the portion of the region of adjacent purine nucleoside residues of the first strand, the portion of the region of adjacent purine nucleoside residues to which both the second strand and the third strand are bonded defining the triple-stranded nucleic acid.

In one aspect, the invention features such a nucleic acid in which the portion of the third strand hydrogen bonded to the region of adjacent purine nucleoside residues includes both purine and pyrimidine nucleoside residues.

In another aspect, the invention features such a triple-stranded nucleic acid in which the portion of the third strand hydrogen bonded to the region of adjacent purine nucleoside residues includes at least one adenosine residue.

In another aspect, the invention features such a triple-stranded nucleic acid in which the portion of the region of adjacent purine nucleoside residues hydrogen bonded to both the second and the third strand includes both guanosine and adenosine residues in a manner whereby they do not alternate along the entire length of the portion of adjacent purine nucleoside residues.

In preferred embodiments, all of the purine nucleoside residues in the portion of the third strand hydrogen bonded to the region of adjacent purine nucleoside residues are adenosines, guanosines, or a mixture of adenosines and guanosines. In other preferred embodiments, each adenosine in the region of adjacent purine nucleoside residues that is hydrogen bonded to a portion of the third strand is hydrogen bonded to either inosine, uridine, thymidine, or adenosine in the portion, and each guanosine in the region of adjacent purine nucleoside residues that is hydrogen bonded to a portion of the third strand is hydrogen bonded to either inosine, cytidine, or guanosine in the portion.

In other preferred embodiments, the third strand is oriented parallel to the first strand; the first and second strands are RNA or DNA; the third strand is RNA or DNA; and the region of adjacent purine nucleoside residues is at least 10 nucleoside residues in length.

In another aspect, the invention features a method of forming a triple-stranded nucleic acid, the method including the steps of providing a first nucleic acid strand that includes a region of adjacent purine nucleoside residues; providing a second nucleic acid strand at least a portion of which is hydrogen bonded in a Watson-Crick manner to the region of adjacent purine nucleoside residues; providing a third nucleic acid strand at least a portion of which is correspondent to a portion of the region of adjacent purine nucleoside residues of the first strand, wherein the portion of the third strand includes at least one guanosine, cytidine, or inosine residue; and contacting the first and second strands with the third strand at a pH between 6 and 8 under ionic conditions which allow formation of stable bonds between the guanosine, cytidine, or inosine residue in the third strand and a corresponding residue in the region of adjacent purine nucleoside residues to allow the portion of the third strand to hydrogen bond to the region of adjacent purine nucleoside residues to yield the triple-stranded nucleic acid; wherein the portion of the region of adjacent purine nucleoside residues to which both the second strand and the third strand are bonded defines the triple-stranded nucleic acid.

In preferred embodiments the residue in the third strand, in order to hydrogen bond to its bonding partner (i.e., correspondent) in the region of adjacent purine nucleoside residues, undergoes protonation that is energetically unfavorable at pH 6-8, and the ionic conditions reduce the electrostatic potential of phosphate groups in the third strand to compensate energetically for the unfavorable protonation. In other embodiments, ionic conditions are the presence of cations such as Mg⁺², Mn⁺², and Ca⁺² that site bind to charged phosphate groups of the third nucleic acid strand to neutralize the charges, or the presence of cations such as Na⁺, Li⁺, K⁺, or tetramethylammonium that shield the charged phosphate groups. In other embodiments, the corresponding bonding partner in the region of adjacent purine nucleoside residues is guanosine.

In order to hydrogen bond to a region of adjacent purines in the first strand, a third strand must have an appropriately corresponding sequence, i.e., each residue in the region of adjacent purines has a correspondent hydrogen bonding partner in the third strand. The following matrix shows the correspondent pairing that is present in the preferred embodiments (+ means the bases are correspondent; - means the bases are not correspondent):

    ____________________________________________________
                             Third-strand Residues
                             A     U/T   I     G     C
    ____________________________________________________
    Watson-Crick Core    A   +     +     +     -     -
    Polypurine Strand    G   -     -     +     +     +
    Residues
    ____________________________________________________

The terms correspondent and corresponding, as used herein, are used to differentiate third-strand hydrogen bonding from standard Watson-Crick hydrogen bonding, in which the terms complement and complementary are commonly used to describe A.T, A.U, and G.C pairing.
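The correspondence matrix can be expressed as a small lookup, which may make the pairing rule easier to apply to a given pair of sequences. The following is a minimal sketch only; the names CORRESPONDENTS and is_correspondent are hypothetical, one-letter residue codes are assumed (with I for inosine), and both sequences are assumed to be written 5' to 3' with the third strand aligned parallel to the polypurine strand.

```python
# Correspondence matrix from the Summary: for each purine in the
# Watson-Crick core polypurine strand, the third-strand residues that
# are marked "+" (correspondent). U and T are listed separately.
CORRESPONDENTS = {
    "A": {"A", "U", "T", "I"},
    "G": {"G", "C", "I"},
}


def is_correspondent(polypurine: str, third_strand: str) -> bool:
    """Return True if every residue of `polypurine` has a correspondent
    partner at the same position of `third_strand`.

    Both sequences are read 5'->3'; a parallel alignment is assumed,
    so position i of one strand faces position i of the other.
    """
    polypurine = polypurine.upper()
    third_strand = third_strand.upper()
    if len(polypurine) != len(third_strand):
        return False
    for purine, partner in zip(polypurine, third_strand):
        allowed = CORRESPONDENTS.get(purine)
        if allowed is None:
            # Residue is not A or G, so this is not a polypurine segment.
            return False
        if partner not in allowed:
            return False
    return True
```

For instance, under these assumptions a target AAGGA would accept the third strands TTCCT, AAGGA, or TTGIA, but not GGAAG.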

The recognition that segments of adjacent purine nucleoside residues (in double-stranded nucleic acids) containing any (random) sequence of adenosine and guanosine residues can serve as targets for third strand nucleic acid binding, and the determination of which bases in the third strand can hydrogen bond to which purine residue correspondent, make possible a new type of assay in which denaturation of double-stranded nucleic acids containing a target sequence is unnecessary. The recognition of the fact that, of the natural nucleic acid residues, third strand adenosine, uridine, and thymidine residues will hydrogen bond only to an adenosine residue in a target region (of adjacent purine nucleoside residues); and that third strand guanosine and cytidine residues will hydrogen bond only to a guanosine residue in a target region (of adjacent purine nucleoside residues), allows for selection of probe sequences that are specific for any distinctive all-purine residue sequence contained in the double-stranded nucleic acid of a target organism. Moreover, the recognition of the fact that such triple-stranded structures can form under physiological conditions (i.e., those conditions found inside a cell) opens up the possibility of controlling gene expression by formation of triple-stranded structures inside a cell.

Other features and advantages of the invention will be apparent from the following description of the preferred embodiments and from the claims.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

I now describe the structure and use of the preferred embodiments, after first briefly describing the drawings.

Drawings

FIG. 1 is a diagrammatic illustration of examples of triple-stranded nucleic acid structures.

FIG. 2(a) is a diagrammatic illustration of the hydrogen bonding that is present in the adenine.uracil base pair.

FIG. 2(b) is a diagrammatic illustration of the hydrogen bonding that may be present in the base triplet uracil.adenine.uracil/thymine.

FIG. 3(a) is a diagrammatic illustration of the hydrogen bonding that is present in the guanine.cytosine base pair.

FIG. 3(b) is a diagrammatic illustration of the hydrogen bonding that may be present in the base triplet cytosine.guanine.cytosine.

FIG. 4 is a diagrammatic illustration of the hydrogen bonding that may be present in the base triplet guanine.guanine.cytosine.

FIG. 5 is a diagrammatic illustration of the hydrogen bonding that may be present in the base triplet hypoxanthine.adenine.uracil/thymine.

FIG. 6 is a diagrammatic illustration of the hydrogen bonding that may be present in the base triplet adenine.adenine.uracil/thymine.

FIG. 7 is a mixing curve plot for the titration of poly(A) and poly(U).

FIG. 8 is a mixing curve plot for the titration of poly(A₈₆, I₁₄) and poly(U).

FIG. 9 is a mixing curve plot for the titration of poly(A) and poly(U₈₀, G₂₀).

FIG. 10 is a mixing curve plot for the titration of poly(G) and poly(C₈₈, 8BrA₁₂).

FIG. 11 is a mixing curve plot for the titration of poly(C₇₉, I₂₁) and poly(G).

FIG. 12 is a mixing curve plot for the titration of poly(C₈₅, G₁₅) and poly(G).

Structure

The triple-stranded nucleic acids of the invention can form where one strand of a standard Watson-Crick double helix has a region of generally at least 10 adjacent purine nucleoside residues (polypurine segment). A third strand can hydrogen bond to the polypurine segment because purine members of Watson-Crick pairs have two remaining hydrogen bond donor or acceptor sites available to further hydrogen bond to correspondent purine or pyrimidine residues in the third strand. An adenosine residue in the polypurine segment can hydrogen bond to either an adenosine, uridine, thymidine, or inosine residue in the third strand. A guanosine residue in the polypurine region can hydrogen bond to either a guanosine, inosine, or cytidine residue in the third strand. A third strand can hydrogen bond to the polypurine region to form a triple-stranded nucleic acid if each residue in the polypurine segment has a corresponding hydrogen bonding partner in the third strand and the third strand is of the same 5' to 3' orientation or polarity as the polypurine strand (i.e., is parallel to that strand). Examples of triple-stranded structures that can form are shown in FIG. 1.
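As an illustration of how candidate polypurine segments might be located in a sequence, the following minimal sketch scans one strand, written 5' to 3' in one-letter codes, for runs of at least 10 adjacent A or G residues. The function name find_polypurine_segments and its min_length parameter are hypothetical and are used here for illustration only; the complementary strand would be scanned separately in the same way.

```python
import re


def find_polypurine_segments(strand: str, min_length: int = 10):
    """Return a list of (start, end, segment) for each maximal run of
    adjacent purine residues (A or G) at least `min_length` long.

    `start` is 0-based and `end` is exclusive, as in Python slicing.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_length)
    return [
        (match.start(), match.end(), match.group())
        for match in pattern.finditer(strand.upper())
    ]
```

The default of 10 residues follows the "generally at least 10" figure in the text; a different cutoff can be passed where a longer or shorter target is of interest.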

A polypurine segment in one strand of a double-stranded DNA can hydrogen bond to either an RNA or a DNA strand that has a corresponding residue alignment. Similarly, a polypurine segment in one strand of a double-stranded RNA can hydrogen bond to either an RNA or a DNA strand that has a corresponding residue alignment. Likewise, should a double-stranded nucleic acid have one RNA and one DNA strand, and one of those strands has a polypurine segment, that segment can hydrogen bond to either a corresponding DNA or a corresponding RNA strand segment of the correct polarity.

FIG. 2 shows the hydrogen bonding that may be present in the base triplet uracil.adenine.uracil (U.A.U). FIG. 2a illustrates the hydrogen bonding that occurs between adenine and uracil in a double-stranded nucleic acid structure. N₁ of adenine is hydrogen bonded to N₃ of uracil, and C₆-N of adenine is hydrogen bonded to C₄=O of uracil. FIG. 2b illustrates the possible hydrogen bonding of adenine in the double-stranded structure of FIG. 2a and a uracil in the third strand. The remaining C₆-N proton and N₇ of the adenine are hydrogen bonded, respectively, to N₃ and C₄=O of the second uracil. The triplets thymine.adenine.uracil (T.A.U), thymine.adenine.thymine (T.A.T), and uracil.adenine.thymine (U.A.T) can have hydrogen bonding patterns analogous to that for U.A.U.

FIG. 3 shows the presumed hydrogen bonding in the base triplet cytosine.guanine.cytosine (C.G.C). FIG. 3a illustrates the hydrogen bonding of cytosine and guanine in a standard double-stranded nucleic acid structure. N₁, C₂-N, and C₆=O of guanine are hydrogen bonded, respectively, to N₃, C₂=O, and C₄-N of cytosine. FIG. 3b illustrates the bonding that can occur between the guanine of FIG. 3a and a cytosine in the third strand. That (second) cytosine must be protonated at N₃ for hydrogen bonding to occur with N₇ of guanine; C₄-N of the second cytosine is also hydrogen bonded to the C₆=O of the guanine. This structure is analogous, or isosteric, with that of the U.A.U triplet.

FIG. 4 shows the hydrogen bonding that may be present in the base triplet guanine.guanine.cytosine (G.G.C). The hydrogen bonding between cytosine and guanine in a double-stranded structure was illustrated in FIG. 3a. The guanine in the FIG. 3a structure can bond to a second guanine to form the triplet. For triplet formation, the second guanine must undergo changes: N₇ should be protonated, C₆=O should tautomerize to C₆-OH, and the base must isomerize about the glycosyl bond from the anti to the syn orientation. The third requirement applies to all purine nucleoside residues in a third strand that contains both purines and pyrimidines. The syn configuration ensures that the glycosyl bond separation distance between the polypurine backbone C₁' and the third strand backbone C₁' is the same whether the third strand base is a purine or a pyrimidine; this allows for a regular winding of the third-strand backbone about the double helix core. If the third strand contains all purine residues, the configuration about the glycosyl bonds of the residues can be either all syn or all anti in order to ensure consistent glycosyl bond separation. As shown in FIG. 4, C₆=O and N₇ of the central (double-strand) guanine are hydrogen bonded, respectively, to C₆-OH and N₇ of the third strand guanine. The triplet hypoxanthine.guanine.cytosine (I.G.C) can have a hydrogen bonding pattern similar to that in the triplet G.G.C.

FIG. 5 shows the hydrogen bonding that may be present in the triplet hypoxanthine.adenine.uracil (I.A.U). The hydrogen bonding between adenine and uracil in a double-stranded structure was illustrated in FIG. 2a. The adenine in FIG. 2a can additionally bond to hypoxanthine to form a triplet. For triplet formation to occur, N₇ of hypoxanthine should be protonated, and hypoxanthine should be oriented syn about its glycosyl bond (assuming that the third strand hydrogen bonding segment of which hypoxanthine is a part contains a mixture of purines and pyrimidines). As shown in FIG. 5, C₆=O and N₇ of hypoxanthine can hydrogen bond, respectively, to C₆-N and N₇ of adenine. In FIG. 5, uracil can be replaced by thymine, which can hydrogen bond to adenine in the same manner as uracil.

FIG. 6 shows the hydrogen bonding that may be present in the triplet adenine.adenine.uracil (A.A.U). The hydrogen bonding between adenine and uracil in a double-stranded structure was illustrated in FIG. 2a. The adenine in FIG. 2a can also bond to a second adenine to form a triplet. For triplet formation to occur, C₆-N of the third strand adenine should exist in the imino tautomeric form, with a tautomeric shift occurring such that N₇ is protonated; and the third strand adenine should be oriented syn about its glycosyl bond (assuming that the third strand of which adenine is a part contains a mixture of purine and pyrimidine nucleoside residues). As shown in FIG. 6, C₆=N and N₇ of third strand adenine can hydrogen bond, respectively, to C₆-N and N₇ of the central adenine. In FIG. 6, uracil can be replaced by thymine, which can bond to adenine in the same manner as uracil.

The third nucleic acid strand (that which hydrogen bonds to the double-stranded nucleic acid) is oriented parallel to the strand containing the polypurine segment and is wrapped around the double helix.

Triple-stranded nucleic acids exist in the A-type helix conformation. Third-strand binding to a DNA double helix, which in solution exists in the B conformation, requires the double helix to undergo a B→A conformational change. An RNA double helix, which already exists in the A conformation, requires no such conformational change. Two factors seem to make the A conformation preferable for triple helices. One is that the A conformation has a much larger major groove that can readily accommodate the third strand, something a B-DNA double helix cannot do. The second factor is that in B-DNA, the backbone phosphates are more or less symmetrically disposed about the helix axis. In contrast, in the A double helix, the two backbone strands are displaced away from the helix axis, which allows more favorable (symmetrical) distribution of the charges of the three backbone phosphates when the third strand is introduced.

The stability of third-strand binding is largely based on two factors: primarily on the energy derived from base stacking, which produces stabilizing π orbital overlap between adjacent bases in the third strand, and secondarily on the hydrogen bonds between each base in the third strand and its purine correspondents in the polypurine segment of the double helix. It is well known that in aqueous solution most of the energy for holding together the strands of a nucleic acid double helix is due to the interaction of the overlapping π electrons of the bases and the tendency of the hydrophobic bases to minimize their interaction with the solvent; the hydrogen bonding between bases of the two strands contributes relatively little to the stability because of the high concentration of water molecules, which are both hydrogen donors and hydrogen acceptors and therefore are strong competitors of the hydrogen donors and acceptors of the bases. In general, stacked purines provide more effective π electron overlap than stacked pyrimidines, especially where a large fraction of the stacked pyrimidines are uracil or thymine. In general, then, the larger the fraction of purine residues in the third strand, the better the π electron overlap between stacked bases is likely to be, and the more stable the three-stranded structure should be.

Generally, triple-stranded nucleic acids whose third strand and corresponding polypurine segment give rise to various combinations of base triplets described above can be formed up to 40° C. in the pH range of approximately 5 to 8 under appropriate ionic conditions, preferably the presence of some combination of site bound (e.g., Mg⁺², Ca⁺², or Mn⁺²) and charge shielding (e.g., Na⁺, K⁺, or tetramethylammonium⁺) cations.

Two general considerations determine the pH stability of triple-stranded nucleic acids: the effect of pH on the two strands that are hydrogen bonded in the standard Watson-Crick manner, and the effect of pH on third strand hydrogen bonding. In the physiological temperature range (0° C.-40° C.), Watson-Crick double-stranded nucleic acid helices are generally stable between pH 4 and 9, with some variation due to base composition (as is well known to those skilled in the art) and the ionic composition of the medium. In the same temperature range, stable third-strand hydrogen bonding to its corresponding polypurine segment can occur under comparable ionic conditions over the narrower pH range of 5 to 8. Above pH 8, deprotonation of third strand guanosine, uridine, thymidine, and inosine residues begins to occur, which can destabilize third strand binding to double helices. Below pH 6, third strand guanosine, cytidine, and inosine residues protonate more readily, conferring added stability to triple helices with these residues. Below pH 5, however, protonation of third-strand adenosine residues begins to occur, which can destabilize the binding of third strands with adenosine residue sequences.

In the absence of appropriate or sufficient cation, the third-strand bases guanine, cytosine, and hypoxanthine do not protonate readily above pH 6, which weakens their binding to guanine bases of the purine segment of a double helix. This difficulty can be overcome if sufficient concentration of either site bound or charge shielding cations is present. Apparently, the cations effectively neutralize the phosphate charges of the third strand and so reduce their electrostatic potential, which results in third-strand binding being much more energetically favorable by compensating for the "cost" of protonation of third strand bases above their intrinsic pK's. Site bound cations (which generally are multivalent) bind directly to the phosphates and are very effective charge neutralizers. Charge shielding cations (which generally are monovalent) function via Debye-Hückel shielding and are two to three orders of magnitude less effective than site bound cations at neutralizing phosphate charges.

The concentration of cations that is necessary to stabilize triple-stranded nucleic acids that have guanosine, cytidine, or inosine residues in their third strands over the physiological temperature range varies with pH and the base composition of the third strand. The effect of pH has been discussed previously; accordingly, as the pH is lowered towards 6, a lower concentration of multivalent cation is needed in the presence of physiological saline (0.15M NaCl), and below pH 6, stable triple-stranded nucleic acids can form without multivalent cation assistance. The same principles apply when only charge shielding cations are present. In addition, the larger the fraction of guanosine, cytidine, and inosine residues in the third strand, the higher the concentration of cation required to form stable triple-stranded structures above pH 6. At neutral pH, at least 1 mM (preferably at least 3 mM) of (unbound) site bound cations (e.g., Mg⁺²), or at least 0.5M (preferably 1M) of charge shielding cations (e.g., Na⁺) in all situations is generally sufficient to yield stable triple-stranded structures. It is noteworthy that the pH and ionic conditions favorable to third-strand binding mimic those found in the cell, i.e., approximately 5 mM Mg⁺² and 0.15M Na⁺.
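The "generally sufficient" figures above can be collected into a coarse screening check. The sketch below is a deliberate simplification and not a statement of the conditions actually required: the function name and parameters are hypothetical, it applies only to third strands containing guanosine, cytidine, or inosine, it applies the neutral-pH cation thresholds across the whole pH 6 to 8 range (which is conservative, since less cation is needed as the pH approaches 6), and below pH 6 it assumes that physiological saline (0.15M Na⁺) is present, as in the text.

```python
def meets_sufficient_conditions(ph: float, temp_c: float,
                                site_bound_mM: float = 0.0,
                                shielding_M: float = 0.0) -> bool:
    """Coarse check against the 'generally sufficient' figures quoted in
    the text for third strands containing G, C, or I residues.

    site_bound_mM: free site bound cation (e.g., Mg+2), millimolar.
    shielding_M:   charge shielding cation (e.g., Na+), molar.
    """
    # Physiological temperature range given in the text: 0 to 40 deg C.
    if not (0.0 <= temp_c <= 40.0):
        return False
    # Third-strand binding is described as stable from about pH 5 to 8.
    if not (5.0 <= ph <= 8.0):
        return False
    if ph < 6.0:
        # Below pH 6 no multivalent cation is needed; the 0.15 M saline
        # requirement here is an assumption taken from the context.
        return shielding_M >= 0.15
    # pH 6 to 8: neutral-pH thresholds applied conservatively.
    return site_bound_mM >= 1.0 or shielding_M >= 0.5
```

Under these assumptions, the cell-like conditions mentioned above (pH 7, 5 mM Mg⁺², 0.15M Na⁺) pass the check, whereas pH 7 with 0.15M Na⁺ alone does not.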

A triple-stranded nucleic acid can be formed in an aqueous environment of suitable pH and ionic strength by interaction of a double-stranded nucleic acid having a polypurine segment and a third strand with a corresponding sequence to the polypurine sequence; the third strand binds to the polypurine segment in the parallel orientation to form the triple-stranded structure. The same triple-stranded nucleic acid structure also can be formed by mixing, in a proper aqueous environment, three nucleic acid strands, two of which are complementary in the standard Watson-Crick manner (and which bind to form a double helix), and the third strand being correspondent (as described above) to the sequence of the polypurine segment of one of the other two strands; the third strand will bind only if its sequence is correspondent to the polypurine segment in the parallel orientation.

Standard mixing curve experiments have been performed to determine the stoichiometry and specificity of third strand binding (i.e., which triplets can form). In one type of experiment, various combinations of pairs of homopolyribonucleotide chains were titrated. In the other type, homopolyribonucleotides were titrated with random copolynucleotides of determined composition that contain as their residues both the Watson-Crick complement to the homopolymer residues and a non-complementary Watson-Crick residue. In different experiments, the core double helix "host" Watson-Crick pairs in these combinations were either A.U or G.C. Copolymers were selected so as to test third-strand binding of various bases to the purine component of each of the Watson-Crick pairs.

The endpoints in such titration experiments are indicated by abrupt discontinuities in the changing ultraviolet absorbance with changing ratio of the interacting strands as the titration proceeds. Careful determination of endpoints corresponding to formation of triple-stranded helices, and analysis of the stoichiometry of strand interaction at those endpoints, has allowed identification of third-strand residues that are hydrogen bonded to adenosine or guanosine residues of polypurine regions of Watson-Crick helices. Such determination and analysis of mixing curves are well known to those skilled in the art. Examples of such titrations are given below.

EXAMPLE 1

FIG. 7 shows a mixing curve plot for the titration of poly(A) and poly(U) at 4° C., pH 7, in the presence of 5 mM Mg⁺² and 0.15M Na⁺. The plot contains three endpoints, of which two, corresponding to the formation of poly(A.U) (X_(A) = 0.5) and poly(U.A.U) (X_(A) = 0.33), were previously recognized from work in non-physiological solvents (e.g., Felsenfeld et al., 26 Biochim. Biophys. Acta 457 (1957); Blake et al., 30 J. Mol. Biol. 291 (1967)). The new endpoint, at X_(A) = 0.67, corresponds to a helix containing the triplet A.A.U.
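The relation between an endpoint and strand stoichiometry can be made explicit: if a complex contains n_A strands of poly(A) and n_U strands of poly(U), the endpoint falls at X_(A) = n_A/(n_A + n_U). The following minimal sketch inverts that relation by finding the simplest ratio consistent with an observed endpoint. The function name and the max_strands parameter (the largest total number of strands considered) are hypothetical and used for illustration only.

```python
from fractions import Fraction


def strand_ratio_from_endpoint(x_a: float, max_strands: int = 3):
    """Return (n_A, n_other), the simplest strand ratio whose mole
    fraction n_A / (n_A + n_other) lies closest to the endpoint `x_a`.

    Only complexes of at most `max_strands` strands are considered.
    """
    # limit_denominator picks the closest fraction whose denominator
    # (here, the total strand count) does not exceed max_strands.
    closest = Fraction(x_a).limit_denominator(max_strands)
    return closest.numerator, closest.denominator - closest.numerator
```

Applied to the three endpoints of FIG. 7, this gives one A strand per U strand at 0.5 (A.U), one A strand per two U strands at 0.33 (U.A.U), and two A strands per U strand at 0.67, in keeping with the A.A.U assignment.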

EXAMPLE 2

FIG. 8 shows the mixing curve plot for the titration of poly(A₈₆, I₁₄) (where the subscripts are the mole fractions of the residues in the random sequence copolymer) and poly(U). The titration was performed at 4° C. and neutral pH in the presence of 5 mM Mg⁺² and 0.15M Na⁺. The endpoint at X_(U) = 0.33 corresponds to a complex containing the triplets A.A.U, I.A.U (and A.I.U). The endpoint at X_(U) = 0.67 corresponds to a complex containing the triplets U.A.U (and U.I.U).

EXAMPLE 3

FIG. 9 shows the mixing curve plot for the titration of poly(A) and poly(U₈₀, G₂₀) at 4° C., pH 6.8, in the presence of 5 mM Mg⁺² and 0.15M Na⁺. The endpoint at X_(A) = 0.28 is consistent with third strand U residues, but not third strand G residues, being hydrogen bonded to the poly(A) strand of the core helix. Hence, the potential G.A.U triplet did not form.
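One way to see why an endpoint near 0.28 points to U-only bonding is to compute where the endpoint would fall if only the U residues (80% of the copolymer) took part. The sketch below is an interpretation offered for illustration, not a procedure stated in the text; the function name and parameters are hypothetical, and it assumes that the non-bonding residues are excluded from both the Watson-Crick strand and the third strand of the complex.

```python
def expected_endpoint(n_homopolymer: int, n_copolymer: int,
                      bonding_fraction: float) -> float:
    """Mole fraction of homopolymer residues at a titration endpoint.

    n_homopolymer, n_copolymer: strands of each kind per complex.
    bonding_fraction: fraction of copolymer residues able to bond to
    the homopolymer; the rest are assumed to be looped out and to
    contribute residues without occupying a bonding position.
    """
    if not 0.0 < bonding_fraction <= 1.0:
        raise ValueError("bonding_fraction must be in (0, 1]")
    # Each bonding position consumes 1 / bonding_fraction copolymer
    # residues on average, per copolymer strand in the complex.
    copolymer_residues = n_copolymer / bonding_fraction
    return n_homopolymer / (n_homopolymer + copolymer_residues)
```

For one poly(A) strand and two copolymer strands, a bonding fraction of 0.8 gives an endpoint of about 0.286, close to the observed 0.28, whereas a bonding fraction of 1.0 (all residues bonding) would give about 0.333.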

EXAMPLE 4

FIG. 10 shows the mixing curve plot for titration of poly(G) and poly(C₈₈, 8BrA₁₂) at 20° C., pH 7, in the presence of 5 mM Mg⁺² and 0.15M Na⁺. The endpoint at X_(G) = 0.67 corresponds to a three-stranded complex containing the triplets G.G.C (and G.G.8BrA).

EXAMPLE 5

FIG. 11 shows the mixing curve plot for the titration of poly(C₇₉, I₂₁) and poly(G) at 4° C., pH 7, in the presence of 5 mM Mg⁺² and 0.15M Na⁺. The endpoints corresponding to three-stranded complexes fall precisely at X_(G) = 0.33 and X_(G) = 0.67. Taking into account the composition of the random copolymer strand participating in each interaction, the complex at X_(G) = 0.33 contains 62.4% of its triplets as C.G.C, 16.6% as I.G.C (16.6% as C.G.I and 4.4% as I.G.I); and the complex at X_(G) = 0.67 contains 79% of its triplets as G.G.C (and 21% as G.G.I).

The complex at X_(G) = 0.33 contains triplets with both purine (I) and pyrimidine (C) nucleoside residues in the third strand position; the triplets are incorporated into the triple-stranded structure in exactly the amount predictable from the composition of the random copolymer used for the titration. This demonstrates that there is no bar to having both purine and pyrimidine nucleoside residues in the third strand, which presumably is wound regularly about the core double helix.
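The triplet percentages quoted for the X_(G) = 0.33 complex follow from treating the third-strand residue and the Watson-Crick residue at each position as independent draws from the same random copolymer, so that each triplet fraction is a product of two mole fractions. The following minimal sketch shows that arithmetic; the function name is hypothetical, and triplets are labeled in the order third strand, core purine, Watson-Crick strand.

```python
def triplet_fractions(composition: dict, core: str = "G") -> dict:
    """Fractions of third.core.Watson-Crick triplets when both outer
    strands come from one random copolymer of the given composition.

    composition: mole fraction of each residue, e.g. {"C": 0.79, "I": 0.21}.
    """
    fractions = {}
    for third, f_third in composition.items():
        for wc, f_wc in composition.items():
            # Independent positions: the joint fraction is the product.
            fractions[f"{third}.{core}.{wc}"] = f_third * f_wc
    return fractions
```

For poly(C₇₉, I₂₁), this gives 0.79 × 0.79 for C.G.C, 0.21 × 0.79 for each of I.G.C and C.G.I, and 0.21 × 0.21 for I.G.I, which round to the 62.4%, 16.6%, 16.6%, and 4.4% stated above.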

EXAMPLE 6

FIG. 12 shows the mixing curve plot for the titration of poly(C₈₅, G₁₅) and poly(G) at 4° C., pH 7, in the presence of 5 mM Mg⁺² and 0.15M Na⁺. As in Example 5, the endpoints corresponding to three-stranded complexes fall precisely at X_(G) = 0.33 and X_(G) = 0.67. Taking into account the composition of the random copolymer strand participating in each interaction, the complex at X_(G) = 0.33 contains 72.2% of its triplets as C.G.C, 12.8% as G.G.C (12.8% as C.G.G and 2.2% as G.G.G); and the complex at X_(G) = 0.67 contains 85% of its triplets as G.G.C (and 15% as G.G.G).

The complex at X_(G) = 0.33 contains triplets with both purine (G) and pyrimidine (C) nucleoside residues in the third strand position. The triplets are incorporated into the triple-stranded structure in exactly the amount predictable from the composition of the random copolymer used for the titration. This provides further evidence that there is no bar to having both purine and pyrimidine nucleoside residues in the third strand.

EXAMPLE 7

The triplets C.G.C, G.G.C, and I.G.C can form at pH 7 in 1M Na⁺ or 5 mM Mg⁺². To prove that third-strand residue protonation is required even at neutral pH for hydrogen bonding to G residues of the double helix, equimolar solutions of poly(G), poly(C), or poly(I) were mixed with poly(G.C) or poly(I.C) at room temperature in unbuffered solvent containing 5 mM Mg⁺². By scrubbing the solutions with N₂ prior to mixing to remove dissolved CO₂, and then mixing in an N₂ atmosphere, the solutions were brought to neutral pH despite the absence of buffer at the time of mixing. The changes in pH accompanying triple helix formation were then monitored.

If third strand binding does not require protonation, the pH should remain constant during triple helix formation; this was observed in control experiments on mixing unbuffered solutions of poly(U) and poly(A.U). In contrast, formation of poly(C.G.C), poly(I.I.C) and poly(C.I.C) from their respective third-strand and Watson-Crick helix precursors was accompanied by a rise in pH, reflecting proton uptake. Moreover, when the third strand from these three-stranded helices was dissociated upon raising the temperature, the pH dropped back to the value prior to mixing, as the protons abstracted from the solvent upon third-strand binding were shed upon third-strand dissociation.

EXAMPLE 8

The specificity, kinetics, and equilibria of third-strand binding to polypurine sequences of nucleic acid double helices have been further examined in column binding experiments. Various nucleic acid single strands, i.e., third strands, were covalently linked to agarose gels, and the resulting affinity matrices used in columns to test their capacity to bind double stranded nucleic acids of different sequence, as a function of ionic strength, temperature, and time of interaction. These experiments have confirmed the specificity of third-strand binding for polypurine sequences of parallel polarity, based upon the binding code shown previously in the matrix in the Summary of the Invention section.

Use

The fact that the triple-stranded nucleic acids can form under physiological conditions, coupled with the fact that many organisms are known to contain polypurine regions of 10 or more residues in their double-stranded DNA (or RNA) genomes (see below), gives rise to a number of novel uses for triple-stranded nucleic acid formation. These uses fall into several categories: (a) the use of triple-stranded nucleic acids for diagnostic and other identification and gene isolation purposes; (b) the use of triple-stranded nucleic acid formation to control gene expression in cells grown in a fermentation; and (c) the use of triple-stranded nucleic acid formation to control expression of bacterial, viral, or eukaryotic genes in a multicellular organism, for example, to treat, in man, animals, or plants, diseases known to be intrinsic to the genetic makeup of the infectious or host organism, or known to be caused by viruses. These uses, which are described in more detail below, make possible novel approaches to commonly-practiced technologies and the development of some new technologies.

Diagnostic Applications

Third-strand binding to target genomic nucleic acid double helices can be used to detect a target DNA sequence, e.g., a DNA sequence characteristic of a particular organism in a sample. The probe is one which can bind to a polypurine region of the double-stranded DNA of the organism and does not hybridize to the other, different polypurine regions that may be present in the sample.

Organisms are known to contain polypurine regions in their double-stranded genomes (generally DNA, but RNA for certain viruses) (see below). Such regions may be identified by a computer search of a genome of an organism, if all or part of the genomic sequence is known; or, for those organisms for which the genomic sequence is not known, by using the polyinosinate (poly(I)) binding method described below.

Having chosen an appropriate unique polypurine region as the target, a single-stranded nucleic acid probe can be designed, using the third strand binding specifications described above, that will specifically form a triple-stranded nucleic acid segment by binding to the double-stranded target region. To preserve this specificity, the probe preferably should not contain an inosine residue, which can bind to both adenine and guanine. Single-stranded nucleic acid probes can be prepared by standard methods. For example, DNA probes can be synthesized by a DNA synthesizer, and RNA probes can be obtained by first making a double-stranded DNA, one strand of which codes for the probe, and then using the SP6 transcription system (for example, Promega Biotech's Riboprobe™ system) to generate the desired RNA probe. Probes also may be labelled with a variety of labels, for example, with ³²P, by standard techniques.
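
By way of illustration only, the choice of third-strand residues for a given polypurine target can be sketched in Python. The sketch uses the pairing recited in claim 1 (adenine in the first strand bonded to adenine, uracil, thymine or hypoxanthine; guanine bonded to guanine, cytosine or hypoxanthine). The names BINDING_CODE and third_strand_options are hypothetical, inosine (I) stands for a hypoxanthine-bearing residue, and strand polarity and the choice between RNA and DNA residues are deliberately not handled.

```python
# Third-strand residues permitted opposite each first-strand purine,
# as recited in claim 1. "I" denotes inosine (hypoxanthine base).
BINDING_CODE = {
    "A": ("A", "U", "T", "I"),
    "G": ("G", "C", "I"),
}


def third_strand_options(target, allow_inosine=False):
    """List the permitted third-strand residues for each target position.

    target: the polypurine region of the first strand (letters A and G).
    allow_inosine: inosine is excluded by default because it binds both
    adenine and guanine and therefore does not discriminate between them.
    """
    options = []
    for position, base in enumerate(target.upper()):
        if base not in BINDING_CODE:
            raise ValueError(
                f"position {position}: {base!r} is not a purine (A or G)"
            )
        choices = [
            residue
            for residue in BINDING_CODE[base]
            if allow_inosine or residue != "I"
        ]
        options.append(choices)
    return options


if __name__ == "__main__":
    for base, choices in zip("AAGGA", third_strand_options("AAGGA")):
        print(base, "->", "/".join(choices))
```

With inosine excluded, the residue sets opposite adenine and opposite guanine do not overlap, which is the property a sequence-specific probe relies on.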

A third-strand binding assay requires that the target double-stranded nucleic acid in the sample is not denatured. Third-strand binding assays also require that the probes are allowed to interact with the double-stranded nucleic acids in the presence of either 5 mM Mg⁺² or 1 M Na⁺, or similar effective charge neutralizing conditions.

Samples containing DNA to be detected are obtained according to standard methods particular to the application. Generally, the cells are lysed, and a deproteinized aqueous extract prepared by standard techniques under conditions that do not denature double helical nucleic acids. In one method, after deproteinizing with phenol or with conventional protease and detergent treatment, the double-stranded DNA-containing extract is brought to 5 mM Mg⁺². A single-stranded nucleic acid probe (RNA or DNA labelled with ³²P or a conventional dye-adduct) with a sequence correspondent to that of the polypurine segment of interest is added to the solution at room temperature, and the solution incubated at least 1 hour at 4° C. to allow probe interaction with correspondent polypurine segments in the target DNA. Since the target DNA is large, and the probe is small, standard gel filtration through a short column of SEPHADEX® G50 (or a molecular sieve with similar exclusion properties) is performed to trap unhybridized labelled probe on the gel, while allowing the larger double-stranded and triple-stranded structures to pass through. The amount of label in the eluate is determined to provide a measure of the triple helix label (the only species labelled) and thereby the quantity of target organism present in the original sample.

Other types of third-strand binding assays may also be employed. For example, instead of performing the assay in solution, the double-stranded DNA of the sample can be attached covalently to a solid support (e.g., cyanogen bromide-activated SEPHAROSE® on a filter) by following known procedures, and the probe applied to the attached DNA under the appropriate conditions (e.g., pH 7, 5 mM Mg⁺², 25° C.) to allow binding of probes to target sequences. The amount of label bound to the support, following standard washing, is a measure of the amount of target organism in the original sample. Alternatively, sample DNA need not be isolated; the assay can be performed by immobilizing cells from the sample on a solid support and lysing the cells so as not to denature cellular DNA and under standard conditions where the double-strand DNA remains immobilized on the support.

Control of Gene Expression

Since triple-stranded nucleic acids are stable under physiological conditions (cells generally contain the equivalent of 5 mM divalent cation and 0.15M Na⁺), and because polypurine segments are widely distributed in genomic sequences (see below), gene expression may be controlled, using triple strands, at the level of transcription. In order for a genomic double-stranded DNA region to be expressed (i.e., to be functionally active), it must be accessible to transcription by RNA polymerase, which requires some sort of transitory unwinding of the DNA double helix. Binding of a third strand to a double-stranded region (to form a stable triple helix segment) is a deterrent to such unwinding of the double helix (and hence to transcription), since the third strand is wound around the genomic double helix. Accordingly, a gene containing a polypurine region within its coding sequence (including intron sequences) or within or near a promoter (or enhancer element) is effectively inactivated as a result of the binding of a third strand; either transcription will be prevented, or it will be interrupted.

Gene Control in Fermentations

Cells are often fermented in commercial manufacturing processes to produce substances such as drugs, antibiotics, and chemicals. Production of the desired substance is often controlled by the activity of one or just a few genes; this can be the case for cells which naturally produce the desired substance and cells transformed with heterologous DNA molecules encoding a protein not normally synthesized by the cell. In many instances, it is desirable to control gene expression in some way, for example, by repressing expression of a gene encoding a desired protein during the growth phase of the fermentation and inducing expression at the end of growth phase, so that product synthesis is delayed until the cells have reached a desired cell mass. This makes for greater metabolic efficiency in producing first the desired cell mass, and later the desired product; moreover, overproduction of sometimes toxic or unwanted products can be avoided. In addition, repression of a gene during growth avoids plasmid instability due to uncontrolled transcription from a cloned gene. Common means of controlling heterologous gene expression include the use of promoters (e.g., the "tac" promoter, DeBoer, U.S. Pat. No. 4,551,433) or DNA rearrangements (e.g., Backman et al., 1984, Bio/Technology 2:1045-49), which are inducible by a change in growth conditions such as the addition of an inducing chemical or a temperature shift.

Third-strand binding can provide another means to control expression of endogenous or heterologous genes in a fermentation. To control expression of an endogenous gene, there must be an appropriate naturally-occurring polypurine region within or near the gene or its promoter, most preferably within its promoter. To control expression of a heterologous gene (generally introduced into a cell in a recombinant DNA vector), a naturally-occurring polypurine region is not necessary, since a synthetic polypurine fragment may be inserted at a desired location, such as adjacent to or within the promoter, during construction of a vector, using standard recombinant DNA techniques. In either case, the desired gene can be inactivated at the time of choice by the introduction into the cell of an appropriate single-stranded nucleic acid that forms a triple-helix by binding to the target polypurine region.

There are a variety of suitable ways to introduce third-strand nucleic acids into a cell. For example, if the nucleic acid is RNA, it can be synthesized by inducible transcription from the standard SP6 plasmid system where the plasmid has been transformed into the cells. Alternatively, RNA or DNA single strands can be packaged into liposomes, which can be introduced into the fermentation broth at the desired time and taken up by the cells. The amount introduced need not be more than a few ng (D. A. Melton, 82 PNAS 144 (1985)).

Genes can also be activated at a chosen time by inactivating a gene that blocks expression of the desired gene, e.g., a gene encoding a repressor protein, which is capable of binding to a DNA region to block transcription (e.g., the lambda repressor protein).

Gene Control in Multicellular Organisms

Third-strand binding may also be useful in the control of gene expression in multicellular organisms, particularly for the treatment of human and animal diseases known to be linked to the expression of one or more host genes, and also in the control of replication of viruses that infect cells of higher organisms. In this application, the formation of a triple-stranded nucleic acid between double-strand genomic DNA and an introduced third strand can be used to block gene expression, by the means described above, in specific cells of a multicellular organism.

Several diseases are suspected of being caused by overexpression of certain gene products. For example, bladder cancer is known to be caused by overexpression of certain proteins. In addition, viral diseases are caused when viruses subvert the genetic machinery of an organism's cells, causing viral genes to be expressed in the host's cells, thereby creating new infectious virus particles.

Third-strand binding can be used to treat such genetic diseases by blockage of gene expression analogously to the procedures described above. A suitable naturally-occurring unique polypurine region within the gene or near its control regions is located either through a search of the known sequence or by the use of the poly(I) method described below, and an appropriate single-stranded nucleic acid is designed to bind specifically to the polypurine region as described above.

As is the case for fermentative gene control, the third strand must be introduced into the cells affected by the disease. In general this is done inside a living organism, although the method might be applicable to treatment of cells which have been removed from the organism, to be returned after treatment (e.g., bone marrow cells and blood cells). A suitable delivery system is chosen to introduce the third strand: possibilities include liposomes and retroviral vectors (Gilboa et al., 83 PNAS 3194 (1986)), which are capable of introducing genes that code for intact single-stranded nucleic acids inside cells.

Identification of Polypurine Regions

Methods of finding polypurine regions in double-stranded nucleic acids include inspecting the published or experimentally-determined base sequences of genes and even whole genomes, and use of ³²P-labelled short polyinosinate strands to screen double-stranded nucleic acids of unknown base sequence for polypurine segments.

The complete base sequences of three genomes (Bacteriophage λ, which has 48,502 base pairs; Simian virus 40 (SV40), which has 5,243 base pairs; and Adenovirus type 2, which has 35,937 base pairs), and part of the sequences (available in GENBANK) of E. coli and humans, were examined for polypurine regions. Table 1 presents some of the results of these searches. For E. coli, 111 genes were examined and 225 polypurine segments of 10 residues or more were found, an average of approximately 2 polypurine segments per gene. The 50 genes in the Bacteriophage λ genome contained 52 polypurine segments of 10 residues or more, an average of approximately 1 polypurine segment per gene. The 6 genes of SV40 contained 15 polypurine segments of 10 or more residues, an average of 2.5 segments per gene. The 50 genes of Adenovirus type 2 contained 50 polypurine segments of 10 or more residues, an average of one polypurine segment per gene. Finally, the 459 human genes examined had a total of 2,087 polypurine segments, an average of about 4.5 segments per gene.

                          TABLE 1
______________________________________________________________
                    # of genes   # of all-purine
                    examined     segments found   segments/gene
______________________________________________________________
E. coli                111            225              2
Bacteriophage λ         50             52              1
SV40                     6             15              2.5
Adenovirus type 2       50             50              1
Human                  459           2087              4.5
______________________________________________________________
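
A computer search of a known sequence for all-purine segments of 10 or more residues can be sketched as follows. This Python example is illustrative only and does not reproduce the search actually performed for Table 1; the function name find_polypurine_segments is hypothetical, and the decision to report purine runs on both strands (a pyrimidine run on the given strand corresponds to a purine run on its complement) is an assumption.

```python
import re


def find_polypurine_segments(sequence, min_length=10):
    """Return (start, end, strand) for each all-purine run.

    start and end are 0-based, end-exclusive indices on the given strand.
    strand "+" marks a purine run on the given strand; strand "-" marks a
    pyrimidine run on the given strand, i.e. a purine run on the
    complementary strand (assumption: both strands are of interest).
    """
    seq = sequence.upper()
    patterns = (
        ("+", f"[AG]{{{min_length},}}"),   # e.g. [AG]{10,}
        ("-", f"[CTU]{{{min_length},}}"),  # e.g. [CTU]{10,}
    )
    hits = []
    for strand, pattern in patterns:
        for match in re.finditer(pattern, seq):
            hits.append((match.start(), match.end(), strand))
    return sorted(hits)


if __name__ == "__main__":
    # Made-up example sequence, not taken from any genome.
    example = "TC" + "AGGAAGGAGA" + "C" + "TTCTTCCTTCT" + "GA"
    for start, end, strand in find_polypurine_segments(example):
        print(start, end, strand, example[start:end])
```

Dividing the number of segments found by the number of genes examined gives the segments-per-gene figures of the kind listed in Table 1, e.g. 225/111 is approximately 2 for E. coli.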

Polypurine segments can be identified in organisms where DNA-base sequences are unknown by utilizing the third-strand binding properties of poly(I). Accordingly, using standard techniques, double-stranded DNA is isolated from the organisms and digested with standard, appropriate restriction enzymes. Alternatively, polypurine segments may be identified within or near genes that have been cloned by digesting the cloned DNA of the gene with restriction enzymes by standard techniques. The digested DNA is passed through a SEPHAROSE® column to which poly(I) strands are covalently bound by standard techniques. Inosine residues bind to both adenosine and guanosine residues and therefore recognize any polypurine segment on the double-stranded digest fragments of DNA. By running the column at 22° C., pH 7 in the presence of 5 mM Mg⁺², those restriction fragments that contain polypurine segments bind to the poly(I) strands, while fragments that do not contain polypurine segments pass through the column. The temperature of the column then is raised and/or the column is rinsed with aqueous solvent containing reduced cation concentrations to cause polypurine segments to dissociate from the poly(I) strands and pass through the column. These eluted fragments are fractionated by standard agarose gel electrophoresis techniques under non-denaturing conditions suitable for third strand binding. The polypurine-containing bands are then located by radioautography after being allowed to interact with ³²P-labelled poly(I) on the gel matrix, and then eluted and sequenced. With a knowledge of the polypurine segment sequences, appropriate RNA or DNA third strands are synthesized by standard methods. Alternatively, cellular RNA can be searched by standard methods for the presence of natural third strands corresponding to those polypurine targets.

OTHER EMBODIMENTS

Other embodiments are within the following claims. For example, organic polyamines such as spermine and spermidine may be used as the site bound cation to provide the appropriate ionic condition for third strand binding to occur. Moreover, the structures of the base residues adenine, cytosine, guanine, thymine, uracil, and hypoxanthine may be modified slightly (to form base analogues) so long as the modification does not preclude the hydrogen bonding schemes required by the respective base for specific third-strand binding. In fact, the words adenine, cytosine, guanine, thymine, uracil, and hypoxanthine should be understood to encompass such slight structural modifications in the respective parent bases. Otherwise long purine segments that contain an occasional pyrimidine should also be capable of third strand binding; accordingly, the term polypurine segment should be understood to include those segments in which at least 90% of the bases are purines, and the interrupting pyrimidines in the sequence are singles (neighbored by purines on both the 5' and 3' sides). The second nucleic acid strand may contain an occasional nucleic acid residue (e.g., one per thousand) that is not complementary in a Watson-Crick manner to its hydrogen bonding partner in the polypurine segment and still be capable of hydrogen bonding with the polypurine segment; accordingly the second strand should be understood to include such strands.
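
The broadened meaning of polypurine segment given above (at least 90% purines, with each interrupting pyrimidine neighbored by purines on both the 5' and 3' sides) can be expressed as a simple test. The following Python sketch is illustrative only; the function name is_polypurine_segment is hypothetical, and it assumes that only A and G are counted as purines and C, T and U as pyrimidines, so that base analogues and inosine are not handled.

```python
PURINES = set("AG")
PYRIMIDINES = set("CTU")


def is_polypurine_segment(segment, min_purine_fraction=0.9):
    """Check the broadened polypurine definition for one segment.

    Returns True when at least min_purine_fraction of the bases are
    purines and every pyrimidine is flanked by purines on both sides
    (so a pyrimidine at either end of the segment is rejected).
    """
    seg = segment.upper()
    if not seg or any(base not in PURINES | PYRIMIDINES for base in seg):
        return False
    purine_count = sum(base in PURINES for base in seg)
    if purine_count / len(seg) < min_purine_fraction:
        return False
    for i, base in enumerate(seg):
        if base in PYRIMIDINES:
            if i == 0 or i == len(seg) - 1:
                return False
            if seg[i - 1] not in PURINES or seg[i + 1] not in PURINES:
                return False
    return True


if __name__ == "__main__":
    # Made-up example segments.
    for example in ("AGGAAGGAGA", "AGGAACGAGA", "AGGACCGAGAAGGAGAAGGA"):
        print(example, is_polypurine_segment(example))
```

In this sketch a ten-residue segment may carry one isolated pyrimidine, whereas two adjacent pyrimidines disqualify a segment even when the 90% threshold is met.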

I claim:
 1. A method of forming a triplex comprising the steps of: (a) providing a first nucleic acid strand comprising a region of adjacent purine residues; (b) providing a second strand at least a portion of which is hydrogen bonded in a Watson-Crick manner to said region of adjacent purine residues; (c) providing a third strand comprising at least ten residues, at least one of which is a purine residue and at least one of which is a pyrimidine residue; and (d) contacting said first and second strands with said third strand at a pH from about 5 to about 8 so as to allow formation of hydrogen bonds between the at least ten residues of said third strand and said region of adjacent purine residues of said first strand so as to form said triplex wherein adenine in said first strand is hydrogen bonded to one of adenine, uracil, thymine and hypoxanthine in said third strand, and guanine in said first strand is hydrogen bonded to one of guanine, cytosine and hypoxanthine in said third strand.
 2. The method of claim 1, wherein said first strand comprises at least one ribonucleoside residue.
 3. The method of claim 2, wherein said third strand comprises at least one ribonucleoside residue.
 4. The method of claim 2, wherein said third strand comprises at least one deoxyribonucleoside residue.
 5. The method of claim 1, wherein said first strand comprises at least one deoxyribonucleoside residue.
 6. The method of claim 5, wherein said third strand comprises at least one deoxyribonucleoside residue.
 7. The method of claim 5, wherein said third strand comprises at least one ribonucleoside residue.
 8. The method of claim 1, wherein said first strand and said third strand comprise heteropolynucleotides.
 9. The method of claim 1, wherein said region of adjacent purine residues in the first strand is at least 90% purines, and any interrupting pyrimidines are single pyrimidines.
 10. The method of claim 1, wherein at least one residue in the third strand is a base analog.
 11. The method of claim 1, wherein the pH is from about 5 to about 6.
 12. A method of forming a triplex comprising the steps of: (a) providing a first nucleic acid strand comprising a region of adjacent purine residues; (b) providing a second strand which is hydrogen bonded in a Watson-Crick manner to said region of adjacent purine residues; (c) providing a third strand comprising at least ten residues, at least one of which is an adenine residue; and (d) contacting said first and second strands with said third strand at a pH from about 5 to about 8 so as to allow formation of hydrogen bonds between the at least ten residues of said third strand and said region of adjacent purine residues of said first strand so as to form said triplex wherein said third strand contains at least one adenine which is hydrogen bonded to an adenine in said adjacent purine residues of said first nucleic acid strand.
 13. The method of claim 12, wherein said first strand comprises at least one ribonucleoside residue.
 14. The method of claim 13, wherein said third strand comprises at least one ribonucleoside residue.
 15. The method of claim 13, wherein said third strand comprises at least one deoxyribonucleoside residue.
 16. The method of claim 12, wherein said first strand comprises at least one deoxyribonucleoside residue.
 17. The method of claim 16, wherein said third strand comprises at least one deoxyribonucleoside residue.
 18. The method of claim 16, wherein said third strand comprises at least one ribonucleoside residue.
 19. The method of claim 12, wherein said first strand and said third strand comprise heteropolynucleotides.
 20. The method of claim 12, wherein said region of adjacent purine residues is at least 90% purine residues, and any interrupting pyrimidines are single pyrimidines.
 21. The method of claim 12, wherein at least one residue in the third strand is a base analog.
 22. The method of claim 12, wherein the pH is from about 5 to about 6.